Rihad Variawa

Data scientist who looks at everything through a lens of numbers. A storyteller by nature and a problem-solver at the core.

Your Turn 0

Add a setup chunk that loads the tidyverse packages.

install.packages('tidyverse')
## Installing package into '/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.6'
## (as 'lib' is unspecified)
library(tidyverse)
## ── Attaching packages ───────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.0     ✔ purrr   0.3.2
## ✔ tibble  2.1.3     ✔ dplyr   0.8.1
## ✔ tidyr   0.8.3     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
mpg
## # A tibble: 234 x 11
##    manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 audi         a4      1.8  1999     4 auto… f        18    29 p     comp…
##  2 audi         a4      1.8  1999     4 manu… f        21    29 p     comp…
##  3 audi         a4      2    2008     4 manu… f        20    31 p     comp…
##  4 audi         a4      2    2008     4 auto… f        21    30 p     comp…
##  5 audi         a4      2.8  1999     6 auto… f        16    26 p     comp…
##  6 audi         a4      2.8  1999     6 manu… f        18    26 p     comp…
##  7 audi         a4      3.1  2008     6 auto… f        18    27 p     comp…
##  8 audi         a4 q…   1.8  1999     4 manu… 4        18    26 p     comp…
##  9 audi         a4 q…   1.8  1999     4 auto… 4        16    25 p     comp…
## 10 audi         a4 q…   2    2008     4 manu… 4        20    28 p     comp…
## # … with 224 more rows

Your Turn 1

Run the code on the slide to make a graph. Pay strict attention to spelling, capitalization, and parentheses!

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

Your Turn 2

Add color, size, alpha, and shape aesthetics to your graph. Experiment.

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
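
The other aesthetics named in the exercise can be tried the same way. The following is a minimal sketch, assuming only ggplot2 and its built-in mpg data. The variable choices (cyl for size, drv for shape), the fixed values, and the object names p_mapped and p_fixed are illustrative and not part of the original exercise.

```r
library(ggplot2)

# Map aesthetics to variables: each mapping goes inside aes().
# alpha is set to a constant here, so it sits outside aes().
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy,
                           color = class, size = cyl, shape = drv),
             alpha = 0.5)

# Set aesthetics to fixed values: put them outside aes()
p_fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue", alpha = 0.4)

p_mapped
p_fixed
```

Inside aes(), an aesthetic varies with the data and gets a legend; outside aes(), it applies one value to every point.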

Help Me

What do facet_grid() and facet_wrap() do? (run the code, interpret, convince your group)

# Facets - subplots that display subsets of the data
# Makes a plot that the commands below will modify
q <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))

q + facet_grid(. ~ cyl)

q + facet_grid(drv ~ .)

q + facet_grid(drv ~ cyl)

q + facet_wrap(~ class)

facet_grid() - 2D grid, rows ~ cols, . for no split

facet_wrap() - 1D ribbon wrapped into 2D

Your Turn 3

Replace this scatterplot with one that draws boxplots. Use the cheatsheet. Try your best guess.

ggplot(mpg) + geom_point(aes(class, hwy))
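
One possible answer, as a sketch: keep the mapping and swap the geom function for geom_boxplot(). The object name p_box is only for illustration.

```r
library(ggplot2)

# Same data and mapping as the scatterplot; only the geom function changes
p_box <- ggplot(mpg) + geom_boxplot(aes(class, hwy))
p_box
```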

Your Turn 4

Make a histogram of the hwy variable from mpg. Hint: do not supply a y variable.

ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Your Turn 5

Use the help page for geom_histogram to make the bins 2 units wide.

ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), binwidth = 2)

Your Turn 6

Make a bar chart of class, colored by class. Use the help page for geom_bar to choose a “color” aesthetic for class.

ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, color = class))
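
For bars, the color aesthetic only draws the outline. The geom_bar help page also lists fill, which colors the interior and is probably what the quoted “color” in the exercise is hinting at. A sketch under that assumption, with p_fill as an illustrative name:

```r
library(ggplot2)

# fill colors the inside of each bar; color would only change its border
p_fill <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = class))
p_fill
```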

Quiz

What will this code do?

ggplot(mpg) + 
  geom_point(aes(displ, hwy)) +
  geom_smooth(aes(displ, hwy))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
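
The code draws two layers on the same axes: the points and a smoothed trend line on top of them. Because both geoms use the same mapping, it can be written once in ggplot() and inherited by each layer. A minimal sketch of that equivalent form, with p_global as an illustrative name:

```r
library(ggplot2)

# Mappings passed to ggplot() are inherited by every layer added after it
p_global <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth()
p_global
```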

Quiz

What is different about this plot? Run the code!

p <- ggplot(mpg) + 
  geom_point(aes(displ, hwy)) +
  geom_smooth(aes(displ, hwy))

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
ggplotly(p)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Your Turn 7

What does getwd() return?

getwd()
## [1] "/cloud/project/02-Visualize"
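
The working directory matters when saving a graph, because a relative file name is resolved against it. A hedged sketch using ggsave(); the file name displ-hwy.pdf and the plot size are made up for illustration.

```r
library(ggplot2)

p <- ggplot(mpg) + geom_point(aes(displ, hwy))

# A relative file name lands in the directory that getwd() reports
ggsave("displ-hwy.pdf", plot = p, width = 6, height = 4)

# Check that the file was written there
file.exists(file.path(getwd(), "displ-hwy.pdf"))
```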

Takeaways

You can use this code template to make thousands of graphs with ggplot2.

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
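
As an illustration of filling in the template with something other than mpg, the sketch below uses the diamonds data that also ships with ggplot2. The choice of dataset, geom, and variable is an assumption made for the example, as is the name p_template.

```r
library(ggplot2)

# <DATA> = diamonds, <GEOM_FUNCTION> = geom_bar, <MAPPINGS> = x = cut
p_template <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))
p_template
```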